-
Notifications
You must be signed in to change notification settings - Fork 110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add Data.Fin.Coding and Data.Fin.Enum #1007
base: main
Are you sure you want to change the base?
Conversation
c69d46c
to
3c36b0e
Compare
Mathlib CI status (docs):
|
This still seems asymptotically inefficient and thus not acceptable for actual computation. For instance:
Note that both data structures also need to be computed lazily, so that for instance decoding 0 or in general a small value doesn't require to compute set of partial products or partial sum (which would again be linear time rather than sublinear time, and in fact might be larger than available memory). This might require to use some sort of binary tree or set of thunks of increasing-power-of-2-sized arrays instead or to modify core Lean to provide some sort of ThunkArray primitive to. This is important since computable functions should always be asymptotically optimal, and also because I think the "decide" tactic would end up decoding all possible values, so if those functions are linear, then decide will be quadratic, which is very bad. |
I don't think it's that simple. Most functions don't have the ability to do precomputation, writing
Not exactly. |
@lyphyser I totally agree about potential optimizations. This is partly why this PR is marked WIP. Some of your thoughts are also things I am considering for the final PR. Your optimizations focus on time only. Part of the reason for this PR is that these functions should, in principle, far extend the memory capacity of a single machine and perhaps even physical reality! So memory is actually more of a concern than time for the "generic" implementation. This might sound high brow but there are incredibly large arrays of data out there that could easily overload any kind of local storage situation as you propose. Anyway, as far as time optimization is concerned, I think there could be a more general class that allows decoding gaps, where decode returns |
What do you mean? What does it do? This only computes "precomputed" at most once in most/all programming languages and I would expect it to work in Lean as well under compilation (and also with reduction, why not?) Or do you mean that it could compute "precomputed" less than once/only partially? (which would be good) Perhaps "precomputation" is not the right word for this and "memoization" or "lazy computation" is a better fit.
Both are a concern, which is why I suggested that the precomputed data structures need to be lazily implemented, so that getting item at index i of the data structure only requires to create a data structure of size O(i). Note that I think this means they can't literally be simple arrays, but I think that having a lazy list of thunks of arrays where the i-the array has 2^i items which are also thunks (each referencing the previous thunk) and where the concatenation of arrays is the whole sequence should work. A better solution would to implement a ThunkArray in core that can grow the underlying array lazily as needed. There is the downside that this doesn't support the case where the data produced would exceed available memory but still be computable in the available time; I'm not sure how to support this well and it seems to be less important than making decoding of all/almost all values linear instead of quadratic; it's possible to try to band-aid over this by querying total available memory from the OS and not creating the data structure if it would be too large, although of course this will just make the program hang for more than the user is willing to wait anyway if it was relying on sublinear running time.
But the user might also want to use native_decide (here e.g. I'm talking about deciding statements of the form "for all x in finitely enumerable type, P(x)", which requires decoding all numbers if no enumerator is provided) or otherwise use it in compiled code so it needs to be asymptotically optimal without relying on that, and if it is asymptotically optimal under compilation, then I think it will be under reduction as well as long as no proofs that aren't constant-sized are present (right?).
Mathlib currently has this for infinite types, in the form of Encodable allowing gaps and Denumerable extending Encodable and not allowing gaps. I think it would be good to support this. It would also be good to support an iterable class where elements are computed one-by-one in order, which would be the most efficient way to decide "forall x in finenum, P(x)" statements: this would be like Rust IntoIterator/Iterator, but also including a proof that every value in type is returned at least once, then extended by a stronger interface that also guarantees that they are returned exactly once. But the equivalence-based class in the current pull request should be asymptotically optimal anyway even if those interfaces are present since some use cases might need it and not be able to use other classes. BTW, it would also be good to have more general classes that support efficient encoding/decoding and partial enumeration of infinite countable types in addition to finite types, and ideally also supporting encoding/decoding of an arbitrary countable subset in case of non-countable types (this would be useful if one wants to just choose any N distinct elements computably); some classes expressing that Zero.zero encodes to 0, One.one encodes to 1, Inhabited.default encodes to 0 would also be useful. |
Indeed, that is part of the plan for |
Like I said, it doesn't do what you think. It will be eta-expanded to |
This seems a very serious bug in the compiler, since a transformation like this can change the asymptotic running time of a program for the worse, and hence potentially make the program unusable in practice, which a compiler clearly should never do. More precisely, code should never be moved inside a function, closure or loop unless it's a constant that compiles as either a machine-word-sized immediate or a reference to read-only global data. This is also what any programmer would expect, and it's very surprising to have a compiler behave otherwise. Has it already been filed as an issue? |
@lyphyser This is not a compiler bug. These are rules of the type theory itself! These are all pure functions, so the very idea of precomputation doesn't make sense. To do what you want in Lean, you can either carry a state by wrapping in a monad or pass the precomputed value as an argument everywhere. You can make sure the precomputed value is calculated just once by passing a default value as a |
I believe that this kind of thing is regarded as a compiler bug, or at least a thing worth improving, but the compiler is on hold and has been for years now, so I try not to plan around getting punctual fixes to such issues. One day in the land of the future we will get a new compiler with what amounts to a completely different operational semantics, possibly coupled with new mechanisms for fine tuning the desired behavior, but in the here and now we have a compiler and programs run relative to the operational semantics you can get out of it. I also don't think an issue is the right place to deal with this, because it's a core aspect of the design. The people working on the compiler are very aware of the tradeoffs here, it's hard not to notice this because doing eta expansion in the compiler is not trivial. This was a conscious decision and perhaps they will make a different conscious decision in the new compiler but it will have to be weighed against all the reasons why they chose this design in the first place. |
I think this can and should be fixed without having to redesign or rewrite the compiler; a completely naive and unoptimized compiler does not have this bug, so it's only a matter of disabling or changing the offending transformations. I filed it as lean4#5908. |
You are very much underestimating the amount of work this is. The compiler does this eta expansion in order to preserve a certain invariant about all applications being full applications and all function bodies being in eta-long form, it can't just not do it without breaking those invariants, so every other part of the compiler would have to be audited to ensure it doesn't rely on the broken invariant and patches would have to be inserted in lots of places. |
@lyphyser @digama0 Maybe this discussion should continue at lean4#5908? While of its own independent interest, the main topic has little to do with this PR. |
Just noticed this PR now and it seems very helpful to me; is there anything I can help with to push this to main? |
protected def find? (P : Fin n → Prop) [DecidablePred P] : Option (Fin n) := | ||
foldr n (fun i v => if P i then some i else v) none | ||
|
||
/-- Custom recursor for `Fin (n+1)`. -/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this exactly the same as Fin.inductionOn
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but it's also not needed here so it will eventually disappear.
One potential problem with this PR is that a bunch of stuff in Mathlib will need to be refactored to incorporate the new API. Is there a separate Mathlib PR for this? |
This is still WIP. I won't get back to it until January. The consensus is that someone needs to move |
WIP replacement for Mathlib's
FinEnum
class.